1 - 20 of 65
1.
J Speech Lang Hear Res ; : 1-27, 2024 Mar 08.
Article En | MEDLINE | ID: mdl-38457261

PURPOSE: One of the strategies that can be used to support speech communication in deaf children is cued speech, a visual code in which manual gestures are used as additional phonological information to supplement the acoustic and labial speech information. Cued speech has been shown to improve speech perception and phonological skills. This exploratory study aims to assess whether and how cued speech reading proficiency may also have a beneficial effect on the acoustic and articulatory correlates of consonant production in children. METHOD: Eight children with cochlear implants (from 5 to 11 years of age) and with different receptive proficiency in Canadian French Cued Speech (three children with low receptive proficiency vs. five children with high receptive proficiency) are compared to 10 children with typical hearing (from 4 to 11 years of age) on their production of stop and fricative consonants. Articulation was assessed with ultrasound measurements. RESULTS: The preliminary results reveal that cued speech proficiency seems to sustain the development of speech production in children with cochlear implants and to improve their articulatory gestures, particularly for the place contrast in stops as well as fricatives. CONCLUSION: This work highlights the importance of studying objective data and comparing acoustic and articulatory measurements to better characterize speech production in children.

2.
J Acoust Soc Am ; 155(3): 2209-2220, 2024 Mar 01.
Article En | MEDLINE | ID: mdl-38526052

Previous studies of speech perception revealed that tactile sensation can be integrated into the perception of stop consonants. It remains uncertain whether such multisensory integration can be shaped by linguistic experience, such as the listener's native language(s). This study investigates audio-aerotactile integration in phoneme perception for English and French monolinguals as well as English-French bilingual listeners. Six-step voice onset time (VOT) continua of alveolar (/da/-/ta/) and labial (/ba/-/pa/) stops constructed from both English and French end points were presented to listeners who performed a forced-choice identification task. Air puffs were synchronized to syllable onset and randomly applied to the back of the hand. Results show that stimuli with an air puff elicited more "voiceless" responses for the /da/-/ta/ continuum from both English and French listeners. This suggests that audio-aerotactile integration can occur even though French listeners do not have an aspiration/non-aspiration contrast in their native language. Furthermore, bilingual speakers showed larger air puff effects than monolinguals in both languages, perhaps due to bilinguals' heightened receptiveness to multimodal information in speech.


Multilingualism; Speech Perception; Language; Linguistics; Speech; Speech Perception/physiology; Humans
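As a rough sketch of how the reported air-puff effect could be quantified, the snippet below fits a logistic regression of "voiceless" responses on continuum step and puff presence. The data layout, variable names, and effect sizes are all hypothetical; the study's actual analysis (e.g., per-listener mixed-effects models) may differ.

```python
# Sketch: logistic regression of "voiceless" responses on VOT step and
# air-puff presence. All data below are synthetic placeholders.
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 600
vot_step = rng.integers(1, 7, n)          # 6-step continuum
puff = rng.integers(0, 2, n)              # air puff present (0/1)
p = 1 / (1 + np.exp(-(-3.5 + 1.0 * vot_step + 0.5 * puff)))  # toy generator
voiceless = rng.binomial(1, p)            # 1 = "voiceless" response

X = sm.add_constant(pd.DataFrame({"vot_step": vot_step, "puff": puff}))
fit = sm.Logit(voiceless, X).fit(disp=0)
print(fit.params)  # a positive "puff" coefficient = more voiceless responses
```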
3.
Phonetica ; 81(1): 43-80, 2024 Feb 26.
Article En | MEDLINE | ID: mdl-37934113

This is an acoustic and articulatory study of the two rhotic schwas in Southwestern Mandarin (SWM), i.e., the er-suffix (a functional morpheme) and the rhotic schwa phoneme. Electromagnetic Articulography (EMA) and ultrasound results from 10 speakers show that the two rhotic schwas were both produced exclusively with the bunching of the tongue body. No retroflex versions of the two rhotic schwas were found, nor was retraction of the tongue root into the pharynx observed. On the other hand, the er-suffix and the rhotic schwa, though homophonous, significantly differ in certain types of acoustic and articulatory measurements. In particular, more pronounced lip protrusion is involved in the production of the rhotic schwa phoneme than in the er-suffix. It is equally remarkable that contrast preservation is not an issue because the two rhotic schwas are in complementary distribution. Taken together, the present results suggest that while morphologically-induced phonetic variation can be observed in articulation, gestural economy may act to constrain articulatory variability, resulting in the absence of retroflex tongue variants in the two rhotic schwas, the only two remaining r-colored sounds in SWM.


Acoustics; Larynx; Humans; Speech Acoustics; Phonetics; Tongue
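A minimal sketch of the lip-protrusion comparison, assuming one mean protrusion value per speaker and condition (all numbers invented for illustration):

```python
# Sketch: paired comparison of lip protrusion (mm) between the er-suffix
# and the rhotic schwa phoneme across 10 speakers. Values are hypothetical.
import numpy as np
from scipy import stats

er_suffix = np.array([1.8, 2.1, 1.5, 2.4, 1.9, 2.0, 1.7, 2.2, 1.6, 2.3])
schwa     = np.array([2.6, 2.9, 2.2, 3.1, 2.7, 2.8, 2.4, 3.0, 2.3, 3.2])

t, p = stats.ttest_rel(schwa, er_suffix)
print(f"paired t = {t:.2f}, p = {p:.4f}")  # more protrusion for the phoneme
```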
4.
medRxiv ; 2023 Feb 01.
Article En | MEDLINE | ID: mdl-36778502

Atypical eye gaze in joint attention is a clinical characteristic of autism spectrum disorder (ASD). Despite this documented symptom, neural processing of joint attention tasks in real-life social interactions is not understood. To address this knowledge gap, functional near-infrared spectroscopy (fNIRS) and eye-tracking data were acquired simultaneously as ASD and typically developed (TD) individuals engaged in a gaze-directed joint attention task with a live human and robot partner. We test the hypothesis that face processing deficits in ASD are greater for interactive faces than for simulated (robot) faces. Consistent with prior findings, neural responses during human gaze cueing modulated by face visual dwell time resulted in increased activity of ventral frontal regions in ASD and dorsal parietal systems in TD participants. Hypoactivity of the right dorsal parietal area during live human gaze cueing was correlated with autism spectrum symptom severity as measured by Brief Observation of Symptoms of Autism (BOSA) scores (r = −0.86). In contrast, neural activity in response to robot gaze cueing modulated by visual acquisition factors activated dorsal parietal systems in ASD, and this neural activity was not related to autism symptom severity (r = 0.06). These results are consistent with the hypothesis that altered encoding of incoming facial information to the dorsal parietal cortex is specific to live human faces in ASD. These findings open new directions for understanding joint attention difficulties in ASD by providing a connection between superior parietal lobule activity and live interaction with human faces. Lay Summary: Little is known about why it is so difficult for autistic individuals to make eye contact with other people. We find that in a live face-to-face viewing task with a robot, the brains of autistic participants were similar to those of typical participants, but not when the partner was a live human. Findings suggest that difficulties in real-life social situations for autistic individuals may be specific to live social interaction rather than face gaze in general.
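The brain-behavior link reported above (r = −0.86) is a simple Pearson correlation between channel responses and symptom scores; a sketch with placeholder values:

```python
# Sketch: correlating right dorsal-parietal fNIRS responses with BOSA
# symptom scores across participants. All values are placeholders.
import numpy as np
from scipy.stats import pearsonr

beta_parietal = np.array([0.12, 0.08, 0.21, 0.05, 0.15, 0.02, 0.18, 0.10])
bosa_score = np.array([14, 18, 9, 20, 12, 22, 10, 16])

r, p = pearsonr(beta_parietal, bosa_score)
print(f"r = {r:.2f}, p = {p:.3f}")
```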

5.
Clin Linguist Phon ; 37(2): 169-195, 2023 02 01.
Article En | MEDLINE | ID: mdl-35243947

Speech sound disorders can pose a challenge to communication in children that may persist into adulthood. As some speech sounds are known to require differential control of anterior versus posterior regions of the tongue body, valid measurement of the degree of differentiation of a given tongue shape has the potential to shed light on development of motor skill in typical and disordered speakers. The current study sought to compare the success of multiple techniques in quantifying tongue shape complexity as an index of degree of lingual differentiation in child and adult speakers. Using a pre-existing data set of ultrasound images of tongue shapes from adult speakers producing a variety of phonemes, we compared the extent to which three metrics of tongue shape complexity differed across phonemes/phoneme classes that were expected to differ in articulatory complexity. We then repeated this process with ultrasound tongue shapes produced by a sample of young children. The results of these comparisons suggested that a modified curvature index and a metric representing the number of inflection points best reflected small changes in tongue shapes across individuals differing in vocal tract size. Ultimately, these metrics have the potential to reveal delays in motor skill in young children, which could inform assessment procedures and treatment decisions for children with speech delays and disorders.


Benchmarking; Phonetics; Adult; Humans; Child; Child, Preschool; Speech Production Measurement/methods; Speech; Tongue/diagnostic imaging; Ultrasonography/methods
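The two best-performing metrics named above can be sketched in a few lines of numpy: an integrated-absolute-curvature index and a count of curvature sign changes (inflection points). The published definitions (e.g., the modified curvature index) may differ in smoothing and normalization details.

```python
# Sketch: two tongue-shape complexity metrics computed from a midsagittal
# contour given as (x, y) point sequences.
import numpy as np

def curvature(x, y):
    """Signed curvature of a parametric 2-D contour."""
    dx, dy = np.gradient(x), np.gradient(y)
    ddx, ddy = np.gradient(dx), np.gradient(dy)
    return (dx * ddy - dy * ddx) / (dx**2 + dy**2) ** 1.5

def complexity_metrics(x, y):
    k = curvature(x, y)
    ds = np.hypot(np.gradient(x), np.gradient(y))        # arc-length element
    curvature_index = np.sum(np.abs(k) * ds)             # integrated |curvature|
    inflections = np.count_nonzero(np.diff(np.sign(k)))  # curvature sign changes
    return curvature_index, inflections

# Toy contour: a single-hump (undifferentiated) tongue shape.
t = np.linspace(0, np.pi, 100)
print(complexity_metrics(t, 0.5 * np.sin(t)))
```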
6.
PLoS One ; 17(11): e0265798, 2022.
Article En | MEDLINE | ID: mdl-36350848

Reluctance to make eye contact during natural interactions is a central diagnostic criterion for autism spectrum disorder (ASD). However, the underlying neural correlates of eye contact in ASD are unknown, and diagnostic biomarkers are active areas of investigation. Here, neuroimaging, eye-tracking, and pupillometry data were acquired simultaneously using two-person functional near-infrared spectroscopy (fNIRS) during live "in-person" eye-to-eye contact and eye gaze at a video face for typically developed (TD) participants and participants with ASD, to identify the neural correlates of live eye-to-eye contact in both groups. Comparisons between ASD and TD showed decreased right dorsal-parietal activity and increased right ventral temporal-parietal activity for ASD during live eye-to-eye contact (p ≤ 0.05, FDR-corrected), as well as reduced cross-brain coherence, consistent with atypical neural systems for live eye contact. Hypoactivity of right dorsal-parietal regions during eye contact in ASD was further associated with gold-standard measures of social performance through correlation of neural responses with individual scores on the ADOS-2 (Autism Diagnostic Observation Schedule, 2nd Edition; r = -0.76, -0.92, and -0.77) and the SRS-2 (Social Responsiveness Scale, Second Edition; r = -0.58). The findings indicate that as categorized social ability decreases, neural responses to real eye contact in the right dorsal-parietal region also decrease, consistent with a neural correlate for social characteristics in ASD.


Autism Spectrum Disorder; Humans; Brain Mapping; Brain/diagnostic imaging; Fixation, Ocular; Parietal Lobe
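The "FDR-corrected" thresholding mentioned above is typically a Benjamini-Hochberg correction across fNIRS channels; a sketch with placeholder p-values:

```python
# Sketch: Benjamini-Hochberg FDR correction over channel-wise p-values.
import numpy as np
from statsmodels.stats.multitest import multipletests

p_values = np.array([0.001, 0.020, 0.040, 0.300, 0.008, 0.110])
reject, p_adj, _, _ = multipletests(p_values, alpha=0.05, method="fdr_bh")
print(reject, np.round(p_adj, 3))
```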
7.
PLoS One ; 17(9): e0272127, 2022.
Article En | MEDLINE | ID: mdl-36107945

PURPOSE: It is well known that speech uses both the auditory and visual modalities to convey information. In cases of congenital sensory deprivation, the feedback language learners have access to for mapping visible and invisible orofacial articulation is impoverished. Although the effects of blindness on the movements of the lips, jaw, and tongue have been documented in francophone adults, not much is known about their consequences for speech intelligibility. The objective of this study is to investigate the effects of congenital visual deprivation on vowel intelligibility in adult speakers of Canadian French. METHOD: Twenty adult listeners performed two perceptual identification tasks in which vowels produced by congenitally blind adults and sighted adults were used as stimuli. The vowels were presented in the auditory, visual, and audiovisual modalities (experiment 1) and at different signal-to-noise ratios in the audiovisual modality (experiment 2). Correct identification scores were calculated. Sequential information analyses were also conducted to assess the amount of information transmitted to the listeners along the three vowel features of height, place of articulation, and rounding. RESULTS: The results showed that, although blind speakers did not differ from their sighted peers in the auditory modality, they had lower scores in the audiovisual and visual modalities. Some vowels produced by blind speakers are also less robust in noise than those produced by sighted speakers. CONCLUSION: Together, the results suggest that adult blind speakers have learned to adapt to their sensory loss so that they can successfully achieve intelligible vowel targets in non-noisy conditions but that they produce less intelligible speech in noisy conditions. Thus, the trade-off between visible (lips) and invisible (tongue) articulatory cues observed between vowels produced by blind and sighted speakers is not equivalent in terms of perceptual efficiency.


Speech Acoustics; Speech Perception; Blindness/congenital; Canada; Humans; Speech Intelligibility; Speech Production Measurement
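The sequential information analysis mentioned above builds on feature-wise information transmission computed from confusion matrices (in the Miller and Nicely tradition). A sketch for a single binary feature such as rounding, with a hypothetical confusion matrix:

```python
# Sketch: relative information transmitted for one vowel feature,
# computed from a stimulus-by-response confusion-count matrix.
import numpy as np

def transmitted_information(confusion):
    """I(stimulus; response) / H(stimulus) from a count matrix."""
    p = confusion / confusion.sum()
    px = p.sum(axis=1, keepdims=True)   # stimulus marginals
    py = p.sum(axis=0, keepdims=True)   # response marginals
    with np.errstate(divide="ignore", invalid="ignore"):
        terms = np.where(p > 0, p * np.log2(p / (px * py)), 0.0)
    h_stim = -np.sum(np.where(px > 0, px * np.log2(px), 0.0))
    return terms.sum() / h_stim

# Rows: stimulus rounded/unrounded; columns: response rounded/unrounded.
conf = np.array([[80, 20],
                 [15, 85]])
print(f"relative info transmitted: {transmitted_information(conf):.2f}")
```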
8.
Front Hum Neurosci ; 16: 879981, 2022.
Article En | MEDLINE | ID: mdl-35911601

Multimodal integration is the formation of a coherent percept from different sensory inputs such as vision, audition, and somatosensation. Most research on multimodal integration in speech perception has focused on audio-visual integration. In recent years, audio-tactile integration has also been investigated, and it has been established that puffs of air applied to the skin and timed with listening tasks shift the perception of voicing by naive listeners. The current study has replicated and extended these findings by testing the effect of air puffs on gradations of voice onset time along a continuum rather than the voiced and voiceless endpoints of the original work. Three continua were tested: bilabial ("pa/ba"), velar ("ka/ga"), and a vowel continuum ("head/hid") used as a control. The presence of air puffs was found to significantly increase the likelihood of choosing voiceless responses for the two VOT continua but had no effect on choices for the vowel continuum. Analysis of response times revealed that the presence of air puffs lengthened responses for intermediate (ambiguous) stimuli and shortened them for endpoint (non-ambiguous) stimuli. The slowest response times were observed for the intermediate steps for all three continua, but for the bilabial continuum this effect interacted with the presence of air puffs: responses were slower in the presence of air puffs, and faster in their absence. This suggests that during integration auditory and aero-tactile inputs are weighted differently by the perceptual system, with the latter exerting greater influence in those cases where the auditory cues for voicing are ambiguous.
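One way to summarize the reported shift is to fit a logistic psychometric function per condition and compare the 50% crossover points; the response proportions below are invented for illustration.

```python
# Sketch: category-boundary shift between puff and no-puff conditions,
# estimated by fitting logistic psychometric functions.
import numpy as np
from scipy.optimize import curve_fit

def psychometric(x, x0, k):
    return 1 / (1 + np.exp(-k * (x - x0)))

steps = np.arange(1, 8)  # VOT continuum steps
p_no_puff = np.array([0.02, 0.05, 0.15, 0.45, 0.80, 0.95, 0.99])
p_puff    = np.array([0.04, 0.10, 0.30, 0.60, 0.88, 0.97, 0.99])

(b0, _), _ = curve_fit(psychometric, steps, p_no_puff, p0=[4, 1])
(b1, _), _ = curve_fit(psychometric, steps, p_puff, p0=[4, 1])
print(f"boundary shift: {b0 - b1:.2f} steps earlier with air puffs")
```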

9.
J Commun Disord ; 99: 106230, 2022.
Article En | MEDLINE | ID: mdl-35728449

PURPOSE: Children with speech errors who have reduced motor skill may be more likely to develop residual errors associated with lifelong challenges. Drawing on models of speech production that highlight the role of somatosensory acuity in updating motor plans, this pilot study explored the relationship between motor skill and speech accuracy, and between somatosensory acuity and motor skill in children. Understanding the connections among sensorimotor measures and speech outcomes may offer insight into how somatosensation and motor skill cooperate during speech production, which could inform treatment decisions for this population. METHOD: Twenty-five children (ages 9-14) produced syllables in an /ɹ/ stimulability task before and after an ultrasound biofeedback treatment program targeting rhotics. We first tested whether motor skill (as measured by two ultrasound-based metrics of tongue shape complexity) predicted acoustically measured accuracy (the normalized difference between the second and third formant frequencies). We then tested whether somatosensory acuity (as measured by an oral stereognosis task) predicted motor skill, while controlling for auditory acuity. RESULTS: One measure of tongue shape complexity was a significant predictor of accuracy, such that higher tongue shape complexity was associated with lower accuracy at pre-treatment but higher accuracy at post-treatment. Based on the same measure, children with better somatosensory acuity produced /ɹ/ tongue shapes that were more complex, but this relationship was only present at post-treatment. CONCLUSION: The predicted relationships among somatosensory acuity, motor skill, and acoustically measured /ɹ/ production accuracy were observed after treatment, but unexpectedly did not hold before treatment. The surprising finding that greater tongue shape complexity was associated with lower accuracy at pre-treatment highlights the importance of evaluating tongue shape patterns (e.g., using ultrasound) prior to treatment, and has the potential to suggest that children with high tongue shape complexity at pre-treatment may be good candidates for ultrasound-based treatment.


Apraxias; Language Development Disorders; Speech Sound Disorder; Stuttering; Adolescent; Child; Humans; Pilot Projects; Speech; Speech Production Measurement; Speech Sound Disorder/therapy
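The acoustic accuracy measure here is a normalized difference between the second and third formants (F3 approaching F2 signals a more adult-like /ɹ/). A sketch assuming the difference is scaled by the formants' mean; the study's exact normalization may differ.

```python
# Sketch: normalized F3-F2 distance as an acoustic accuracy proxy for /ɹ/.
def normalized_f3_f2(f2_hz: float, f3_hz: float) -> float:
    return (f3_hz - f2_hz) / ((f3_hz + f2_hz) / 2)

print(normalized_f3_f2(f2_hz=1700, f3_hz=2000))  # smaller: more rhotic
print(normalized_f3_f2(f2_hz=1400, f3_hz=2900))  # larger: less rhotic
```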
10.
Clin Linguist Phon ; 36(12): 1112-1131, 2022 12 02.
Article En | MEDLINE | ID: mdl-34974782

Contours traced by trained phoneticians have been considered the most accurate way to identify the midsagittal tongue surface in ultrasound video frames. In this study, inter-measurer reliability was evaluated using measures that quantified both how closely human-placed contours approximated each other and how consistent measurers were in defining the start and end points of contours. High reliability across three measurers was found for all measures, consistent with treating contours placed by trained phoneticians as the 'gold standard.' However, due to the labour-intensive nature of hand-placing contours, automatic algorithms that detect the tongue surface are increasingly being used to extract tongue-surface data from ultrasound videos. Contours placed by six automatic algorithms (SLURP, EdgeTrak, EPCS, and three different configurations of the algorithm provided in Articulate Assistant Advanced (AAA)) were compared to human-placed contours, using the same measures that were used to evaluate the consistency of the trained phoneticians. We found that contours defined by SLURP, EdgeTrak, and two of the AAA configurations closely matched the hand-placed contours along sections of the image where the algorithms and humans agreed that there was a discernible contour. All of the algorithms were much less reliable than humans in determining the anterior (tongue-tip) edge of tongue contours. Overall, the contours produced by SLURP, EdgeTrak, and AAA should be usable in a variety of clinical applications, subject to spot-checking. Additional practical considerations for these algorithms are also discussed.


Algorithms; Tongue; Humans; Reproducibility of Results; Ultrasonography/methods; Tongue/diagnostic imaging
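A simple way to score how closely an automatically detected contour tracks a hand-placed one is a mean nearest-neighbor distance between the two point sets; the study's actual agreement measures may differ.

```python
# Sketch: mean distance from each point of one tongue contour to the
# nearest point of another. Contours are (N, 2) arrays of (x, y) points.
import numpy as np

def mean_nearest_distance(contour_a, contour_b):
    diffs = contour_a[:, None, :] - contour_b[None, :, :]
    dists = np.linalg.norm(diffs, axis=2)
    return dists.min(axis=1).mean()

# Toy example: two slightly offset arcs.
t = np.linspace(0, np.pi, 50)
hand = np.column_stack([t, np.sin(t)])
auto = np.column_stack([t, np.sin(t) + 0.05])
print(f"{mean_nearest_distance(auto, hand):.3f}")  # ~0.05
```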
11.
J Speech Lang Hear Res ; 64(7): 2557-2574, 2021 07 16.
Article En | MEDLINE | ID: mdl-34232685

Purpose Generalizations can be made about the order in which speech sounds are added to a child's phonemic inventory and the ways that child speech deviates from adult targets in a given language. Developmental and disordered speech patterns are presumed to reflect differences in both phonological knowledge and skilled motor control, but the relative contribution of motor control remains unknown. The ability to differentially control anterior versus posterior regions of the tongue increases with age, and thus, complexity of tongue shapes is believed to reflect an individual's capacity for skilled motor control of speech structures. Method The current study explored the relationship between tongue complexity and phonemic development in children (ages 4-6 years) with and without speech sound disorder producing various phonemes. Using established metrics of tongue complexity derived from ultrasound images, we tested whether tongue complexity incrementally increased with age in typical development, whether tongue complexity differed between children with and without speech sound disorder, and whether tongue complexity differed based on perceptually rated accuracy (correct vs. incorrect) for late-developing phonemes in both diagnostic groups. Results Contrary to hypothesis, age was not significantly associated with tongue complexity in our typical child sample, with the exception of one association between age and complexity of /t/ for one measure. Phoneme was a significant predictor of tongue complexity, and typically developing children had more complex tongue shapes for /ɹ/ than children with speech sound disorder. Those /ɹ/ tokens that were rated as perceptually correct had higher tongue complexity than the incorrect tokens, independent of diagnostic classification. Conclusions Quantification of tongue complexity can provide a window into articulatory patterns characterizing children's speech development, including differences that are perceptually covert. With the increasing availability of ultrasound imaging, these measures could help identify individuals with a prominent motor component to their speech sound disorder and could help match those individuals with a corresponding motor-based treatment approach. Supplemental Material https://doi.org/10.23641/asha.14880039.


Speech Sound Disorder; Speech; Adult; Child; Child, Preschool; Humans; Phonetics; Speech Production Measurement; Speech Sound Disorder/diagnostic imaging; Tongue/diagnostic imaging; Ultrasonography
12.
J Speech Lang Hear Res ; 64(7): 2637-2667, 2021 07 16.
Article En | MEDLINE | ID: mdl-34153203

Purpose This study compares two electromagnetic articulographs manufactured by Northern Digital, Inc.: the NDI Wave System (from 2008) and the NDI Vox-EMA System (from 2020). Method Four experiments were completed: (a) comparison of statically positioned sensors, (b) tracking dynamic movements of sensors manipulated using a motor-driven LEGO apparatus, (c) tracking small and large movements of sensors mounted in a rigid bar manipulated by hand, and (d) tracking movements of sensors rotated on a circular disc. We assessed spatial variability for statically positioned sensors, variability in the transduced Euclidean distances between sensor pairs, and missing data rates. For sensors tracking circular movements, we compared the fit between fitted ideal circles and actual trajectories. Results The average sensor pair tracking error (i.e., the standard deviation of the Euclidean distances) was 1.37 mm for the WAVE and 0.12 mm for the VOX during automated trials at the fastest speed, and 0.35 mm for the WAVE and 0.14 mm for the VOX during the tracking of large manual movements. The average standard deviation of the fitted circle radii charted by manual circular disc movements was 0.72 mm for the WAVE sensors and 0.14 mm for the VOX sensors. There was no significant difference between the WAVE and the VOX in the number of missing frames. Conclusions In general, the VOX system significantly outperformed the WAVE on measures of both static precision and dynamic accuracy (automated and manual). For both systems, positional precision and spatial variability were influenced by the sensors' position relative to the field generator unit (worse when further away). Supplemental Material https://doi.org/10.23641/asha.14787846.


Electromagnetic Phenomena; Movement
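The circle-tracking comparison above amounts to fitting an ideal circle to a sensor trajectory and examining the spread of the radii; a sketch using an algebraic (Kasa) least-squares fit on a synthetic noisy circle:

```python
# Sketch: least-squares circle fit to a tracked sensor trajectory.
import numpy as np

def fit_circle(x, y):
    """Algebraic circle fit; returns (cx, cy, radius)."""
    A = np.column_stack([2 * x, 2 * y, np.ones_like(x)])
    b = x**2 + y**2
    (cx, cy, c), *_ = np.linalg.lstsq(A, b, rcond=None)
    return cx, cy, np.sqrt(c + cx**2 + cy**2)

rng = np.random.default_rng(1)
theta = np.linspace(0, 2 * np.pi, 200)
x = 30 * np.cos(theta) + rng.normal(0, 0.2, theta.size)  # mm
y = 30 * np.sin(theta) + rng.normal(0, 0.2, theta.size)

cx, cy, r = fit_circle(x, y)
radii = np.hypot(x - cx, y - cy)
print(f"fitted radius {r:.2f} mm, SD of radii {radii.std():.2f} mm")
```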
13.
J Phon ; 872021 Jul.
Article En | MEDLINE | ID: mdl-34012182

Vowel-intrinsic fundamental frequency (IF0), the phenomenon that high vowels tend to have a higher fundamental frequency (f0) than low vowels, has been studied for over a century, but its causal mechanism is still controversial. The most commonly accepted "tongue-pull" hypothesis successfully explains the IF0 difference between high and low vowels but fails to account for gradient IF0 differences among low vowels. Moreover, previous studies that investigated the articulatory correlates of IF0 showed inconsistent results and did not appropriately distinguish between the tongue and the jaw. The current study used articulatory and acoustic data from two large corpora of American English (44 speakers in total) to examine the separate contributions of tongue and jaw height on IF0. Using data subsetting and stepwise linear regression, the results showed that both the jaw and tongue heights were positively correlated with vowel f0, but the contribution of the jaw to IF0 was greater than that of the tongue. These results support a dual mechanism hypothesis in which the tongue-pull mechanism contributes to raising f0 in non-low vowels while a secondary "jaw-push" mechanism plays a more important role in lowering f0 for non-high vowels.
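The jaw-versus-tongue decomposition described above is, at its core, a multiple regression of f0 on the two articulatory heights; a minimal sketch with synthetic data (the study additionally used data subsetting and stepwise selection):

```python
# Sketch: regressing vowel f0 on jaw and tongue height jointly.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)
n = 300
jaw = rng.normal(0, 1, n)      # z-scored jaw height
tongue = rng.normal(0, 1, n)   # z-scored tongue height
f0 = 120 + 4.0 * jaw + 1.5 * tongue + rng.normal(0, 3, n)  # toy generator

df = pd.DataFrame({"f0": f0, "jaw": jaw, "tongue": tongue})
fit = smf.ols("f0 ~ jaw + tongue", data=df).fit()
print(fit.params)  # a larger jaw coefficient mirrors the "jaw-push" account
```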

14.
Clin Linguist Phon ; 35(1): 19-42, 2021 01 02.
Article En | MEDLINE | ID: mdl-32242467

The rhotic sound /r/ is one of the latest-emerging sounds in English, and many children receive treatment for residual errors affecting /r/ that persist past the age of 9. Auditory-perceptual abilities of children with residual speech errors are thought to be different from those of their typically developing peers. This study examined auditory-perceptual acuity in children with residual speech errors affecting /r/ and the relation of these skills to production accuracy, both before and after a period of treatment incorporating visual biofeedback. Identification of items along an /r/-/w/ continuum was assessed prior to treatment. Production accuracy for /r/ was acoustically measured from standard /r/ stimulability probes elicited before and after treatment. Fifty-nine children aged 9-15 with residual speech errors (RSE) affecting /r/ completed treatment, and forty-eight age-matched controls who completed the same auditory-perceptual task served as a comparison group. It was hypothesized that children with RSE would show lower auditory-perceptual acuity than typically developing speakers and that higher auditory-perceptual acuity would be associated with more accurate production before treatment. It was also hypothesized that auditory-perceptual acuity would serve as a mediator of treatment response. Results indicated that typically developing children have more acute perception of the /r/-/w/ contrast than children with RSE. Contrary to hypothesis, baseline auditory-perceptual acuity for /r/ did not predict baseline production severity. For baseline auditory-perceptual acuity in relation to biofeedback efficacy, there was an interaction between auditory-perceptual acuity and gender, such that higher auditory-perceptual acuity was associated with greater treatment response in female, but not male, participants.


Speech Perception; Speech Sound Disorder; Articulation Disorders; Auditory Perception; Child; Female; Humans; Speech; Speech Therapy
15.
J Speech Lang Hear Res ; 63(6): 1658-1674, 2020 06 22.
Article En | MEDLINE | ID: mdl-32516559

Objective We aimed to investigate the production of contrastive emphasis in French-speaking 4-year-olds and adults. Based on previous work, we predicted that, due to their immature motor control abilities, preschool-aged children would produce smaller articulatory differences between emphasized and neutral syllables than adults. Method Ten 4-year-old children and 10 adult French speakers were recorded while repeating /bib/, /bub/, and /bab/ sequences in neutral and contrastive emphasis conditions. Synchronous recordings of tongue movements, lip and jaw positions, and speech signals were made. Lip positions and tongue shapes were analyzed; formant frequencies, amplitude, fundamental frequency, and duration were extracted from the acoustic signals; and between-vowel contrasts were calculated. Results Emphasized vowels were higher in pitch and intensity and longer in duration than their neutral counterparts in all participants. However, the effect of contrastive emphasis on lip position was smaller in children. Prosody did not affect tongue position in children, whereas it did in adults. As a result, children's productions were perceived less accurately than those of adults. Conclusion These findings suggest that 4-year-old children have not yet learned to produce hypoarticulated forms of phonemic goals to allow them to successfully contrast syllables and enhance prosodic saliency.


Phonetics; Speech; Adult; Child, Preschool; Goals; Humans; Speech Acoustics; Speech Production Measurement
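The acoustic measures listed (fundamental frequency, amplitude, formants, duration) can be pulled from a recording with Praat via the parselmouth library; a sketch with a placeholder file name:

```python
# Sketch: extracting f0, intensity, duration, and formants with parselmouth.
import numpy as np
import parselmouth

snd = parselmouth.Sound("bib_emphasis.wav")       # hypothetical recording
f0 = snd.to_pitch().selected_array["frequency"]   # Hz; 0 where unvoiced
intensity = snd.to_intensity()
formants = snd.to_formant_burg()
t_mid = snd.duration / 2

print(f"duration: {snd.duration:.3f} s")
print(f"median f0: {np.median(f0[f0 > 0]):.1f} Hz")
print(f"mean intensity: {intensity.values.mean():.1f} dB")
print(f"F1 at vowel midpoint: {formants.get_value_at_time(1, t_mid):.0f} Hz")
```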
16.
PLoS One ; 15(4): e0231484, 2020.
Article En | MEDLINE | ID: mdl-32287289

PURPOSE: This study aimed to evaluate the role of motor control immaturity in the speech production characteristics of 4-year-old children, compared to adults. Specifically, two indices were examined: trial-to-trial variability, which is assumed to be linked to motor control accuracy, and anticipatory extra-syllabic vowel-to-vowel coarticulation, which is assumed to be linked to the comprehensiveness, maturity, and efficiency of sensorimotor representations in the central nervous system. METHOD: Acoustic and articulatory (ultrasound) data were recorded for 20 children and 10 adults, all native speakers of Canadian French, during the production of isolated vowels and vowel-consonant-vowel (V1-C-V2) sequences. Trial-to-trial variability was measured in isolated vowels. Extra-syllabic anticipatory coarticulation was assessed in V1-C-V2 sequences by measuring the patterns of variability of V1 associated with variations in V2. Acoustic data are reported for all subjects, and articulatory data for a subset of 6 children and 2 adults. RESULTS: Trial-to-trial variability was significantly larger in children. Systematic and significant anticipation of V2 in V1 was always found in adults but was rare in children. Significant anticipation was observed in children only when V1 was /a/, and only along the antero-posterior dimension, with a much smaller magnitude than in adults. A closer analysis of individual speakers revealed that some children showed adult-like anticipation along this dimension, whereas the majority did not. CONCLUSION: The larger trial-to-trial variability and the lack of anticipatory behavior in most children, two phenomena that have been observed in several non-speech motor tasks, support the hypothesis that motor control immaturity may explain a large part of the differences observed between speech production in adults and 4-year-old children, apart from other causes that may be linked with language development.


Psychomotor Performance/physiology; Speech/physiology; Acoustics; Adult; Anticipation, Psychological/physiology; Canada; Child, Preschool; Female; Humans; Language; Language Development; Male; Phonetics; Sound Spectrography/methods; Speech Acoustics; Speech Articulation Tests/methods; Speech Production Measurement/methods
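Trial-to-trial variability of isolated vowels can be summarized as the coefficient of variation of formant values across repetitions; the repetitions below are hypothetical:

```python
# Sketch: coefficient of variation (CV) of F1/F2 across vowel repetitions.
import numpy as np

f1 = np.array([850, 910, 790, 880, 830, 940, 805, 895])  # Hz, /a/ trials
f2 = np.array([1450, 1380, 1520, 1410, 1490, 1360, 1505, 1430])

for name, f in [("F1", f1), ("F2", f2)]:
    print(f"{name} CV: {f.std(ddof=1) / f.mean():.3f}")  # larger in children
```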
17.
Front Hum Neurosci ; 14: 606397, 2020.
Article En | MEDLINE | ID: mdl-33584223

Although the neural systems that underlie spoken language are well-known, how they adapt to evolving social cues during natural conversations remains an unanswered question. In this work we investigate the neural correlates of face-to-face conversations between two individuals using functional near infrared spectroscopy (fNIRS) and acoustical analyses of concurrent audio recordings. Nineteen pairs of healthy adults engaged in live discussions on two controversial topics where their opinions were either in agreement or disagreement. Participants were matched according to their a priori opinions on these topics as assessed by questionnaire. Acoustic measures of the recorded speech including the fundamental frequency range, median fundamental frequency, syllable rate, and acoustic energy were elevated during disagreement relative to agreement. Consistent with both the a priori opinion ratings and the acoustic findings, neural activity associated with long-range functional networks, rather than the canonical language areas, was also differentiated by the two conditions. Specifically, the frontoparietal system including bilateral dorsolateral prefrontal cortex, left supramarginal gyrus, angular gyrus, and superior temporal gyrus showed increased activity while talking during disagreement. In contrast, talking during agreement was characterized by increased activity in a social and attention network including right supramarginal gyrus, bilateral frontal eye-fields, and left frontopolar regions. Further, these social and visual attention networks were more synchronous across brains during agreement than disagreement. Rather than localized modulation of the canonical language system, these findings are most consistent with a model of distributed and adaptive language-related processes including cross-brain neural coupling that serves dynamic verbal exchanges.

18.
Front Psychol ; 10: 2459, 2019.
Article En | MEDLINE | ID: mdl-31827451

Movements of the head and speech articulators have been observed in tandem during an alternating word pair production task driven by an accelerating rate metronome. Word pairs contrasted either onset or coda dissimilarity with same word controls. Results show that as production effort increased, so did speaker head nodding, and that nodding increased abruptly following errors. More errors occurred under faster production rates, and in coda rather than onset alternations. The greatest entrainment between head and articulators was observed at the fastest rate under coda alternation. Neither jaw coupling nor imposed prosodic stress was observed to be a primary driver of head movement. In alternating pairs, nodding frequency tracked the slower alternation rate rather than the syllable rate, interpreted as recruitment of additional degrees of freedom to stabilize the alternation pattern under increasing production rate pressure.
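Entrainment between head and articulator movement of the kind described can be quantified with magnitude-squared coherence; a sketch on synthetic head and jaw signals (sampling rate and coupling strength invented):

```python
# Sketch: coherence between head and jaw movement trajectories.
import numpy as np
from scipy.signal import coherence

fs = 100  # Hz, hypothetical motion-capture sampling rate
t = np.arange(0, 20, 1 / fs)
rng = np.random.default_rng(5)
alternation = np.sin(2 * np.pi * 1.5 * t)   # ~1.5 Hz word-alternation rate
head = alternation + rng.normal(0, 0.5, t.size)
jaw = 0.8 * alternation + rng.normal(0, 0.5, t.size)

f, cxy = coherence(head, jaw, fs=fs, nperseg=512)
print(f"peak coherence {cxy.max():.2f} at {f[np.argmax(cxy)]:.2f} Hz")
```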

19.
J Speech Lang Hear Res ; 62(8S): 3033-3054, 2019 08 29.
Article En | MEDLINE | ID: mdl-31465705

Purpose This study examines the temporal organization of vocalic anticipation in German children from 3 to 7 years of age and adults. The main objective was to test for nonlinear processes in vocalic anticipation, which may result from the interaction between lingual gestural goals for individual vowels and those for their neighbors over time. Method The technique of ultrasound imaging was employed to record tongue movement at 5 time points throughout short utterances of the form V1#CV2. Vocalic anticipation was examined with generalized additive modeling, an analytical approach allowing for the estimation of both linear and nonlinear influences on anticipatory processes. Results Both adults and children exhibit nonlinear patterns of vocalic anticipation over time with the degree and extent of vocalic anticipation varying as a function of the individual consonants and vowels assembled. However, noticeable developmental discrepancies were found with vocalic anticipation being present earlier in children's utterances at 3-5 years of age in comparison to adults and, to some extent, 7-year-old children. Conclusions A developmental transition towards more segmentally-specified coarticulatory organizations seems to occur from kindergarten to primary school to adulthood. In adults, nonlinear anticipatory patterns over time suggest a strong differentiation between the gestural goals for consecutive segments. In children, this differentiation is not yet mature: Vowels show greater prominence over time and seem activated more in phase with those of previous segments relative to adults.


Speech/physiology; Adult; Age Factors; Child; Child, Preschool; Female; Humans; Male; Speech Production Measurement; Tongue/diagnostic imaging; Tongue/physiology; Young Adult
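Generalized additive modeling of this kind can be sketched with the pygam library: a smooth of time plus a vowel-dependent smooth captures nonlinear anticipation. All data are synthetic and the model structure is simplified relative to the study's.

```python
# Sketch: GAM of tongue position over time, with a smooth that varies by
# the upcoming vowel. Requires the pygam package.
import numpy as np
from pygam import LinearGAM, s

rng = np.random.default_rng(3)
n = 400
time = rng.uniform(0, 1, n)        # normalized position within V1#CV2
v2_front = rng.integers(0, 2, n)   # upcoming vowel: 0 = back, 1 = front
# Toy generator: anticipation of a front V2 grows nonlinearly over time.
tongue_x = 2 * v2_front * time**2 + rng.normal(0, 0.2, n)

X = np.column_stack([time, v2_front])
gam = LinearGAM(s(0) + s(0, by=1)).fit(X, tongue_x)
gam.summary()  # the by-term captures vowel-dependent anticipation
```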
20.
J Acoust Soc Am ; 146(1): 316, 2019 07.
Article En | MEDLINE | ID: mdl-31370597

Speech inversion is a well-known ill-posed problem, and the addition of speaker differences typically makes it even harder. Normalizing the speaker differences is essential to effectively using multi-speaker articulatory data for training a speaker-independent speech inversion system. This paper explores a vocal tract length normalization (VTLN) technique to transform the acoustic features of different speakers to a target speaker's acoustic space such that speaker-specific details are minimized. The speaker-normalized features are then used to train a deep feed-forward neural network-based speech inversion system. The acoustic features are parameterized as time-contextualized mel-frequency cepstral coefficients. The articulatory features are represented by six tract-variable (TV) trajectories, which are relatively speaker-invariant compared to flesh-point data. Experiments are performed with ten speakers from the University of Wisconsin X-ray microbeam database. Results show that the proposed speaker normalization approach provides an 8.15% relative improvement in correlation between actual and estimated TVs compared to a system without speaker normalization. To determine the efficacy of the method across datasets, cross-speaker evaluations were performed with speakers from the Multichannel Articulatory-TIMIT and EMA-IEEE datasets. Results show that the VTLN approach improves performance even across datasets.
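The evaluation metric described (correlation between actual and estimated tract variables) reduces to a per-dimension Pearson correlation averaged over the six TVs; a sketch with synthetic trajectories:

```python
# Sketch: mean correlation between actual and estimated TV trajectories.
import numpy as np

def tv_correlation(actual, estimated):
    """Mean Pearson r across TV dimensions; inputs are (time, n_tvs)."""
    return np.mean([np.corrcoef(actual[:, i], estimated[:, i])[0, 1]
                    for i in range(actual.shape[1])])

rng = np.random.default_rng(4)
t = np.linspace(0, 2 * np.pi, 500)
actual = np.column_stack([np.sin(k * t) for k in range(1, 7)])  # 6 TVs
estimated = actual + rng.normal(0, 0.3, actual.shape)
print(f"mean TV correlation: {tv_correlation(actual, estimated):.3f}")
```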

...